                                  Readme for: 
          "Encouraging Community Action Against Teacher Absenteeism: A Mass Media Experiment in Rural Uganda"
                 Donald P. Green, Benjamin Tan, Anna M. Wilke

                                   Contents

- 01_data/                              =    Contains raw data
  - field_experiment/                   =    Data from main field experiment
    - cluster_level_data/               =    Data on the level of the village (trading center)
        - film_festival.csv             =    Data on intervention
        - sampling_radius.csv           =    Radii used for sampling of respondents in a given village
        - treatment_assignment.csv      =    Random assignment
    - codebooks/                        =    Questionnaire coding in surveyCTO
        - endline_choices.csv           =    Endline survey answer options
        - endline_Qs.csv                =    Endline survey questions
        - midline_choices.csv           =    Midline survey answer options
        - midline_Qs.csv                =    Midline survey questions
        - vht_endline_choices.csv       =    VHT endline survey answer options
        - vht_endline_Qs.csv            =    VHT endline survey questions
        - vht_midline_choices.csv       =    VHT midline survey answer options
        - vht_midline_Qs.csv            =    VHT midline survey questions
    - household/                        =    Data from household surveys
        - distance_data.csv             =    Distance of respondents (Rs) to video hall
        - endline_1.csv                 =    Endline HH survey
        - endline_2.csv                 =    Endline HH survey
        - endline_3.csv                 =    Endline HH survey
        - endline_4.csv                 =    Endline HH survey
        - midline_1.csv                 =    Midline HH survey
        - midline_2.csv                 =    Midline HH survey
        - midline_3_1.csv               =    Midline HH survey
        - midline_3_2.csv               =    Midline HH survey
        - midline_3.csv                 =    Midline HH survey
        - midline_4.csv                 =    Midline HH survey
        - midline_5.csv                 =    Midline HH survey
    - vht/                              =    Data from surveys with village health teams (VHTs)
        - vht_el_1.csv                  =    VHT endline survey
        - vht_ml_1.csv                  =    VHT midline survey
        - vht_ml_2.csv                  =    VHT midline survey
        - vht_ml_3.csv                  =    VHT midline survey
        - vht_ml_4.csv                  =    VHT midline survey
  - lab_experiment/                     =    Data from lab-in-the-field experiment
    - codebooks/                        =    Questionnaire coding in surveyCTO
      - lab_choices.csv                 =    Lab-in-the-field survey answer options
      - lab_Qs.csv                      =    Lab-in-the-field survey questions
    - lab.csv                           =    Lab-in-the-field experiment data
    - LabAssignments.RData              =    Lab-in-the-field experiment random assignment vectors
  - lasso_covariates/                   =    Covariates selected through lasso 
    - lasso_selected_covariates.Rdata   =    Covariates selected through lasso 
  - pilot_field_experiment/             =    Data from pilot field experiment
    - cluster_level_data/               =    Data on the level of the village (trading center)
      - pilot_film_festival.csv         =    Data on intervention
      - pilot_treatment_assignment.csv  =    Random assignment
    - codebooks/                        =    Questionnaire coding in surveyCTO
      - pilot_choices.csv               =    Pilot survey answer options
      - pilot_Qs.csv                    =    Pilot survey questions
    - household/                        =    Data from household surveys
      - pilot_endline.csv               =    Endline HH survey pilot experiment
  - UG_absenteeism_data.Rdata           =    All datasets after cleaning, coding and imputing variables
- 02_code/                              =    Code scripts
  - _main_script.R                      =    Main script that runs others
  - 00_useful_functions/                =    Functions used throughout
  - 01_codebook/                        =    Scripts that build codebooks
  - 02_load_and_clean_data/             =    Loading and cleaning datasets
  - 03_variable_coding/                 =    Coding outcomes and covariates
  - 04_merging/                         =    Merging datasets
  - 05_covariate_selection/             =    Lasso covariate selection
  - 06_analyses/                        =    Analysis scripts
      - main_text/                      =    Code for tables and figures in main text
      - supplementary_material/         =    Code for tables and figures in appendix
- 03_tables/                            =    Tables are output here
- 04_figures/                           =    Figures are output here
- Absenteeism_replication.Rproj         =    Run everything from here


                            How to run the code 

1. Open Absenteeism_replication.Rproj to ensure that all file paths are set relative to
   the replication archive.
2. Open _main_script.R and run all other scripts from there.
3. TRUE / FALSE switches turn scripts that take a long time to run on and off.
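
The switch pattern described in step 3 might look like the following minimal
sketch. The switch names, the run_if() helper, and the placeholder path are
hypothetical illustrations and are not taken from _main_script.R; consult the
script itself for the actual names.

```r
# Hypothetical switches; the real names in _main_script.R may differ.
run_covariate_selection <- FALSE  # slow step: skip it
run_analyses            <- TRUE   # fast step: run it

# Source a script only when its switch is TRUE.
# Returns TRUE if the script was run and FALSE if it was skipped.
run_if <- function(switch_on, script_path) {
  if (isTRUE(switch_on)) {
    source(script_path)
    return(TRUE)
  }
  message("Skipping ", script_path)
  FALSE
}

# Usage with a placeholder path (an assumption, not an actual file name):
# run_if(run_covariate_selection, "path/to/slow_script.R")
```

Setting a switch to FALSE skips the corresponding script, so a slow step does
not have to be repeated on every run.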

                      Explanations and clarifications

- For the main field experiment, most analyses are based on a panel of "compliers" interviewed in both the 
  midline and endline surveys. This subset can be identified with the condition 
  respondent_category == "Complier". 
- We did not re-ask all questions of those in the panel, so for questions that were not 
  re-asked, their endline values are merged in from their midline responses.
- The multiple versions of the raw data correspond to the different datasets
  output by ODK / CSO when a change is made to the survey. Each change requires
  a new survey version, thus producing a new dataset.
- The only change made to the raw data files is the removal of PII.
- All other modifications made to the data (changing of values, etc.) in the cleaning 
  scripts were implemented by the field manager over the course of fieldwork. 
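
As an illustration of the complier subset mentioned in the first point above,
the sketch below filters a toy data frame. Only the variable
respondent_category and the value "Complier" come from this readme; the data
frame, the respondent_id column, and the "Other" category are made up for the
example and are not study data.

```r
# Toy stand-in for a respondent-level dataset (made-up values).
toy <- data.frame(
  respondent_id       = 1:4,
  respondent_category = c("Complier", "Other", "Complier", NA),
  stringsAsFactors    = FALSE
)

# Keep only the panel compliers. %in% treats NA as a non-match, so rows
# with a missing category are dropped instead of returned as all-NA rows
# (which is what indexing with == would do).
compliers <- toy[toy$respondent_category %in% "Complier", ]
nrow(compliers)
```

The same condition could be applied to the cleaned data once it has been
loaded, assuming the object there carries the respondent_category variable.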

                                 R Packages

The R package renv (0.14.0) was used to manage package dependencies.
Running all scripts from within the R project ensures that the correct package versions are used. 
The relevant files are contained in the folder called "renv".
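
If the package library is not set up automatically when the project is opened,
it can usually be restored by hand. The sketch below is one way to do this,
assuming a lockfile named renv.lock sits in the project root (the renv default,
but not stated in this readme). The function name restore_project_packages is
hypothetical.

```r
# Restore the package versions recorded by renv for a project.
# Assumption: renv.lock is in the project root (the renv default).
restore_project_packages <- function(project = getwd()) {
  if (!requireNamespace("renv", quietly = TRUE)) {
    stop("Package 'renv' is not installed; run install.packages('renv') first.")
  }
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile)) {
    stop("No renv.lock found in ", project)
  }
  # prompt = FALSE installs the recorded versions without asking first.
  renv::restore(project = project, prompt = FALSE)
}

# Usage, from the folder containing Absenteeism_replication.Rproj:
# restore_project_packages()
```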
